A revised method to detect erroneous characters wrongly substituted, deleted, and inserted at the end position in Japanese sentences and ‘bunsetsu’s
Internal identifier: 000517 (Main/Exploration); previous: 000516; next: 000518
Authors: Chikahiro Araki [Japan]; Mikio Mori [Japan]; Shuji Taniguchi [Japan]
Source:
- IEEJ Transactions on Electrical and Electronic Engineering [ 1931-4973 ] ; 2011-03.
English descriptors
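- KwdEn: Markov chain model, deletion error, error detection, insertion error, skipped Markov chain model, substitution error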
Abstract
A method for detecting characters wrongly substituted, deleted, or inserted at interior positions of Japanese sentences and ‘bunsetsu’s, using an mth‐order Markov chain model, was proposed earlier and found to be useful in detecting these erroneous characters. With that method, however, it is difficult to detect erroneous characters at the end position of Japanese sentences and ‘bunsetsu’s, because the Markov chain probabilities of erroneous characters at the end position do not remain smaller than the critical value T the same number of times. This paper proposes a method for detecting erroneous characters located at the end position of sentences and ‘bunsetsu’s that uses the ‘skipped Markov chain model’ in addition to the ‘connected Markov chain model’. Experiments with newspaper articles show the proposed method to be useful for correcting erroneous characters located at the end position of sentences and ‘bunsetsu’s. © 2011 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.
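The end-position difficulty described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy corpus, the function names (`train`, `chain_probs`, `windows_covering`), the order `m = 2`, and the threshold `T = 0.5` are all assumptions chosen for illustration, and plain ASCII strings stand in for Japanese text. The sketch covers only an ordinary mth-order character model; the abstract does not define the ‘skipped Markov chain model’, so it is not attempted here.

```python
from collections import Counter


def train(corpus, m):
    """Count m-character contexts and (m+1)-character sequences in a corpus."""
    ctx_counts, gram_counts = Counter(), Counter()
    for text in corpus:
        for i in range(m, len(text)):
            ctx = text[i - m:i]
            ctx_counts[ctx] += 1
            gram_counts[ctx + text[i]] += 1
    return ctx_counts, gram_counts


def chain_probs(model, text, m):
    """Estimate P(text[i] | previous m characters) for i = m .. len(text) - 1."""
    ctx_counts, gram_counts = model
    probs = []
    for i in range(m, len(text)):
        ctx = text[i - m:i]
        seen = ctx_counts[ctx]
        # An unseen context gets probability 0.0 (no smoothing in this sketch).
        probs.append(gram_counts[ctx + text[i]] / seen if seen else 0.0)
    return probs


def windows_covering(pos, length, m):
    """Number of (m+1)-character windows in a text of `length` characters
    that contain the character at index `pos`."""
    first = max(pos - m, 0)            # earliest window start containing pos
    last = min(pos, length - m - 1)    # latest window start containing pos
    return max(last - first + 1, 0)


# Hypothetical toy data: m, T, and the corpus are illustrative assumptions.
m, T = 2, 0.5
model = train(["abcdefg"] * 3, m)

interior_error = "abcXefg"   # wrong character in the middle
end_error = "abcdefX"        # wrong character at the end

low_interior = sum(p < T for p in chain_probs(model, interior_error, m))
low_end = sum(p < T for p in chain_probs(model, end_error, m))
print(low_interior, low_end)
```

In this toy setting an interior error pulls m + 1 consecutive probabilities below T, while an error in the final position affects only one, which is one plausible reading of why a rule based on the number of consecutive low probabilities would miss end-position errors.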
Url: https://api.istex.fr/document/BAD0D4B84B18B067C1F27DD2A75B94739E05307E/fulltext/pdf
DOI: 10.1002/tee.20640
Affiliations:
Links toward previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 000B97
- to stream Istex, to step Curation: 000B82
- to stream Istex, to step Checkpoint: 000174
- to stream Main, to step Merge: 000523
- to stream Main, to step Curation: 000517
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">A revised method to detect erroneous characters wrongly substituted, deleted, and inserted at the end position in Japanese sentences and ‘bunsetsu’s</title>
<author><name sortKey="Araki, Chikahiro" sort="Araki, Chikahiro" uniqKey="Araki C" first="Chikahiro" last="Araki">Chikahiro Araki</name>
</author>
<author><name sortKey="Mori, Mikio" sort="Mori, Mikio" uniqKey="Mori M" first="Mikio" last="Mori">Mikio Mori</name>
</author>
<author><name sortKey="Taniguchi, Shuji" sort="Taniguchi, Shuji" uniqKey="Taniguchi S" first="Shuji" last="Taniguchi">Shuji Taniguchi</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:BAD0D4B84B18B067C1F27DD2A75B94739E05307E</idno>
<date when="2011" year="2011">2011</date>
<idno type="doi">10.1002/tee.20640</idno>
<idno type="url">https://api.istex.fr/document/BAD0D4B84B18B067C1F27DD2A75B94739E05307E/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000B97</idno>
<idno type="wicri:Area/Istex/Curation">000B82</idno>
<idno type="wicri:Area/Istex/Checkpoint">000174</idno>
<idno type="wicri:doubleKey">1931-4973:2011:Araki C:a:revised:method</idno>
<idno type="wicri:Area/Main/Merge">000523</idno>
<idno type="wicri:Area/Main/Curation">000517</idno>
<idno type="wicri:Area/Main/Exploration">000517</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">A revised method to detect erroneous characters wrongly substituted, deleted, and inserted at the end position in Japanese sentences and ‘bunsetsu’s</title>
<author><name sortKey="Araki, Chikahiro" sort="Araki, Chikahiro" uniqKey="Araki C" first="Chikahiro" last="Araki">Chikahiro Araki</name>
<affiliation wicri:level="1"><country xml:lang="fr">Japon</country>
<wicri:regionArea>Department of Human and Artificial Intelligence Systems, Graduate School of Engineering, University of Fukui, 3‐9‐1 Bunkyo, Fukui‐shi 910‐8507</wicri:regionArea>
<wicri:noRegion>Fukui‐shi 910‐8507</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Mori, Mikio" sort="Mori, Mikio" uniqKey="Mori M" first="Mikio" last="Mori">Mikio Mori</name>
<affiliation wicri:level="1"><country xml:lang="fr">Japon</country>
<wicri:regionArea>Department of Information and Media Engineering, Graduate School of Engineering, University of Fukui, 3‐9‐1 Bunkyo, Fukui‐shi 910‐8507</wicri:regionArea>
<wicri:noRegion>Fukui‐shi 910‐8507</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Taniguchi, Shuji" sort="Taniguchi, Shuji" uniqKey="Taniguchi S" first="Shuji" last="Taniguchi">Shuji Taniguchi</name>
<affiliation wicri:level="1"><country xml:lang="fr">Japon</country>
<wicri:regionArea>Department of Information and Media Engineering, Graduate School of Engineering, University of Fukui, 3‐9‐1 Bunkyo, Fukui‐shi 910‐8507</wicri:regionArea>
<wicri:noRegion>Fukui‐shi 910‐8507</wicri:noRegion>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="j">IEEJ Transactions on Electrical and Electronic Engineering</title>
<title level="j" type="abbrev">IEEJ Trans Elec Electron Eng</title>
<idno type="ISSN">1931-4973</idno>
<idno type="eISSN">1931-4981</idno>
<imprint><publisher>Wiley Subscription Services, Inc., A Wiley Company</publisher>
<pubPlace>Hoboken</pubPlace>
<date type="published" when="2011-03">2011-03</date>
<biblScope unit="volume">6</biblScope>
<biblScope unit="issue">2</biblScope>
<biblScope unit="page" from="168">168</biblScope>
<biblScope unit="page" to="172">172</biblScope>
</imprint>
<idno type="ISSN">1931-4973</idno>
</series>
<idno type="istex">BAD0D4B84B18B067C1F27DD2A75B94739E05307E</idno>
<idno type="DOI">10.1002/tee.20640</idno>
<idno type="ArticleID">TEE20640</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">1931-4973</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Markov chain model</term>
<term>deletion error</term>
<term>error detection</term>
<term>insertion error</term>
<term>skipped Markov chain model</term>
<term>substitution error</term>
</keywords>
</textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">A method to detect the erroneous characters wrongly substituted, deleted, and inserted at the interior location of Japanese sentences and ‘bunsetsu’s using mth‐order Markov chain model has been proposed earlier and was found to be useful in detecting these erroneous characters. However, with this method it is difficult to detect erroneous characters at the end position of Japanese sentences and ‘bunsetsu’s, because the Markov chain probabilities of erroneous characters at the end position of sentences and ‘bunsetsu’s, do not remain smaller than the critical value T the same number of times. This paper proposes a method to detect erroneous characters located at the end position of sentences and ‘bunsetsu’s using the ‘skipped Markov chain model’ in addition to the ‘connected Markov chain model’. From experiments with newspaper articles, the proposed method is shown to be useful to correct erroneous characters located at the end position of sentences and ‘bunsetsu’s. © 2011 Institute of Electrical Engineers of Japan. Published by John Wiley &amp; Sons, Inc.</div>
</front>
</TEI>
<affiliations><list><country><li>Japon</li>
</country>
</list>
<tree><country name="Japon"><noRegion><name sortKey="Araki, Chikahiro" sort="Araki, Chikahiro" uniqKey="Araki C" first="Chikahiro" last="Araki">Chikahiro Araki</name>
</noRegion>
<name sortKey="Mori, Mikio" sort="Mori, Mikio" uniqKey="Mori M" first="Mikio" last="Mori">Mikio Mori</name>
<name sortKey="Taniguchi, Shuji" sort="Taniguchi, Shuji" uniqKey="Taniguchi S" first="Shuji" last="Taniguchi">Shuji Taniguchi</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000517 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000517 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= ISTEX:BAD0D4B84B18B067C1F27DD2A75B94739E05307E |texte= A revised method to detect erroneous characters wrongly substituted, deleted, and inserted at the end position in Japanese sentences and ‘bunsetsu’s }}
This area was generated with Dilib version V0.6.32.